Abstract

Block diagonalization is a linear precoding technique for the multiple antenna broadcast
(downlink) channel that involves transmission of multiple data streams to each receiver such
that no multi-user interference is experienced at any of the receivers. This low-complexity
scheme operates only a few dB away from capacity but requires very accurate channel knowledge
at the transmitter. We consider a limited feedback system where each receiver knows its
channel perfectly, but the transmitter is only provided with a finite number of channel
feedback bits from each receiver. Using a random quantization argument, we quantify the
throughput loss due to imperfect channel knowledge as a function of the feedback level. The
quality of channel knowledge must improve in proportion to the SNR in order to prevent
interference limitations, and we show that scaling the number of feedback bits linearly with
the system SNR is sufficient to maintain a bounded rate loss. Finally, we compare our
quantization strategy to an analog feedback scheme and show the superiority of quantized
feedback.
I. Introduction

In multiple antenna broadcast (downlink) channels, transmit antenna arrays can be used to
simultaneously transmit data streams to receivers and thereby significantly increase
throughput. Dirty paper coding (DPC) is capacity achieving for the MIMO broadcast channel [1],
but this technique has a very high level of complexity. Zero Forcing (ZF) and Block
Diagonalization (BD) [2] [3] are alternative low-complexity transmission techniques. Although
not optimal, these linear precoding techniques utilize all available spatial degrees of freedom
and perform measurably close to DPC in many scenarios [4].

If the transmitter is equipped with M antennas and there are at least M aggregate receive 
antennas, zero-forcing involves transmission of M spatial beams such that independent, de- 
coupled data channels are created from the transmit antenna array to M receive antennas 
distributed amongst a number of receivers. Block diagonalization similarly involves transmis- 
sion of M spatial beams, but the beams are selected such that the signals received at different 
receivers, but not necessarily at the different antenna elements of a particular receiver, are 
de-coupled. For example, if there are M/2 receivers with two antennas each, then two beams 
are aimed at each of the receivers. If ZF is used, an independent and de-coupled data stream 
is received on each of the M antennas. If BD is used, the streams for different receivers 
do not interfere, but the two streams intended for a single receiver are generally not aligned 
with its two antennas and thus post-multiplication by a rotation matrix (to align the streams) 
is generally required before decoding. 

In order to correctly aim the transmit beams, both schemes require perfect Channel State 
Information at the Transmitter (CSIT). Imperfect CSIT leads to incorrect beam selection and 
therefore multiuser interference, which ultimately leads to a throughput loss. Unlike point 
to point MIMO systems where imperfect CSIT causes only an SNR offset in the capacity 
vs. SNR curve, the level of CSIT affects the slope of the curve and hence the multiplexing 
gain in broadcast MIMO systems. We consider the case when the CSI is known perfectly at 
the receiver and is communicated to the transmitter through a limited feedback channel and 
quantify the maximum rate loss due to limited feedback with BD. 

MISO systems and ZF with limited feedback are analyzed in [5]. Similar to the results 
in [5], we show that scaling the number of feedback bits approximately linearly with the 
system SNR is sufficient to maintain the slope of the capacity vs. SNR curve and hence a 
constant gap from the capacity of BD with perfect CSIT. The scaling factor for BD offers an
advantage over ZF in terms of the number of bits required to achieve the same sum capacity.

Rather than quantizing the CSIT into a finite number of bits and feeding this information 
back, the channel coefficients can also be explicitly transmitted over the feedback link. We 
compare this scheme to quantized feedback for an AWGN feedback channel, and show the 
superiority of quantized feedback. 
II. System Model

We consider a MIMO broadcast (downlink) system with a single transmitter or base station
and $K$ receivers or users. Each user has $N$ antennas and the transmitter has $M$ antennas.
The broadcast channel is described as:

$$\mathbf{y}_k = \mathbf{H}_k^H \mathbf{x} + \mathbf{n}_k, \quad k = 1, \ldots, K \tag{1}$$

where $\mathbf{H}_k \in \mathbb{C}^{M \times N}$ is the channel matrix from the transmitter to
the $k^{th}$ user ($1 \le k \le K$), the vector $\mathbf{x} \in \mathbb{C}^{M \times 1}$ is
the transmitted signal, $\mathbf{n}_k \in \mathbb{C}^{N \times 1}$ are independent complex
Gaussian noise vectors of unit variance and $\mathbf{y}_k \in \mathbb{C}^{N \times 1}$ is the
received signal vector at the $k^{th}$ user. We assume a transmit power constraint so that
$E[\|\mathbf{x}\|^2] \le P$ ($P > 0$). We also assume that $K = \frac{M}{N}$ (with $K \ge 2$),
which implies that the aggregate number of receive antennas equals the number of transmit
antennas; as a result it is not necessary to select a subset of users for transmission.

The entries of $\mathbf{H}_k$ are assumed to be i.i.d. unit variance complex Gaussian random
variables, and the channel is assumed to be block fading with independent fading from block to
block. Each user is assumed to have perfect and instantaneous knowledge of its own channel
matrix. The channel matrix is quantized by each user and fed back to the transmitter (which
has no other knowledge of the instantaneous CSI) over a zero delay, error free, limited
feedback channel.

It is assumed that a uniform power allocation policy is adopted (i.e., we do not perform
waterfilling across streams), which is known to be asymptotically optimal for large SNR.
Hence, in order to perform Block Diagonalization, it is only necessary to know the spatial
direction of each user's channel, i.e., the subspace spanned by the columns of
$\mathbf{H}_k$, and the feedback only needs to convey this information.

The quantization codebook used by each user is fixed beforehand and is known to the
transmitter. A quantization codebook $\mathcal{C}$ consists of $2^B$ matrices in
$\mathbb{C}^{M \times N}$, i.e., $(\mathbf{W}_1, \ldots, \mathbf{W}_{2^B})$, where $B$ is the
number of feedback bits allocated per user. The quantization of a channel matrix
$\mathbf{H}_k$, say $\hat{\mathbf{H}}_k$, is chosen from the codebook $\mathcal{C}$ according
to the following rule:

$$\hat{\mathbf{H}}_k = \arg\min_{\mathbf{W} \in \mathcal{C}} d^2(\mathbf{H}_k, \mathbf{W}) \tag{2}$$

where $d(\mathbf{H}_k, \mathbf{W})$ is the distance metric. Here, we consider the chordal
distance [6]:

$$d(\mathbf{H}_k, \mathbf{W}) = \sqrt{\sum_{j=1}^{N} \sin^2 \theta_j} \tag{3}$$

where the $\theta_j$'s are the principal angles between the two subspaces spanned by the
columns of the matrices $\mathbf{H}_k$ and $\mathbf{W}$ [6]. As the principal angles depend
only on the subspaces spanned by the columns of the matrices, it can be assumed that the
elements of $\mathcal{C}$ are unitary matrices (i.e.
$\mathbf{W}^H \mathbf{W} = \mathbf{I}_N \ \forall\, \mathbf{W} \in \mathcal{C}$), without loss
of generality. An alternate form for the chordal distance is
$d^2(\mathbf{H}_k, \mathbf{W}) = N - \mathrm{tr}\left(\tilde{\mathbf{H}}_k^H \mathbf{W} \mathbf{W}^H \tilde{\mathbf{H}}_k\right)$,
where $\tilde{\mathbf{H}}_k$ forms an orthonormal basis for the subspace spanned by the columns
of $\mathbf{H}_k$. Note that other distance metrics may also be considered, but we do not
investigate this further in this work. No channel magnitude information is fed back to the
transmitter.
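The two forms of the chordal distance can be checked against each other numerically. The following is a minimal sketch, assuming the distance is $d^2 = \sum_j \sin^2\theta_j$ with the principal-angle cosines obtained as singular values of $\tilde{\mathbf{H}}^H \tilde{\mathbf{W}}$; the function names are illustrative and not from the paper.

```python
import numpy as np

def orthonormal_basis(H):
    # Reduced QR: Q has orthonormal columns spanning the column space of H.
    Q, _ = np.linalg.qr(H)
    return Q

def chordal_sq_from_angles(H, W):
    # Cosines of the principal angles are the singular values of Hb^H Wb.
    Hb, Wb = orthonormal_basis(H), orthonormal_basis(W)
    cosines = np.linalg.svd(Hb.conj().T @ Wb, compute_uv=False)
    cosines = np.clip(cosines, 0.0, 1.0)
    return float(np.sum(1.0 - cosines ** 2))

def chordal_sq_from_trace(H, W):
    # Alternate form: N - tr(Hb^H Wb Wb^H Hb).
    Hb, Wb = orthonormal_basis(H), orthonormal_basis(W)
    N = Hb.shape[1]
    G = Hb.conj().T @ Wb
    return float(N - np.real(np.trace(G @ G.conj().T)))

rng = np.random.default_rng(0)
M, N = 4, 2
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
W = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
d2_angles = chordal_sq_from_angles(H, W)
d2_trace = chordal_sq_from_trace(H, W)
```

Both functions orthonormalize their inputs first, so they depend only on the subspaces and not on the particular bases.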

III. Background 

A. Block Diagonalization 

The Block Diagonalization strategy, when perfect CSI is available at the transmitter,
involves linear precoding that suppresses the interference at each user due to all other
users (but does not suppress interference due to different antennas for the same user). If
$\mathbf{u}_k \in \mathbb{C}^{N \times 1}$ contains the $N$ complex (data) symbols intended for
the $k^{th}$ ($1 \le k \le K$) user and $\mathbf{V}_k \in \mathbb{C}^{M \times N}$ is the
precoding matrix, then the transmitted vector is given by:

$$\mathbf{x} = \sum_{k=1}^{K} \mathbf{V}_k \mathbf{u}_k \tag{4}$$

and the received signal at the $k^{th}$ user is given by:

$$\mathbf{y}_k = \mathbf{H}_k^H \mathbf{V}_k \mathbf{u}_k + \sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \mathbf{V}_j \mathbf{u}_j + \mathbf{n}_k \tag{5}$$

The $\sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \mathbf{V}_j \mathbf{u}_j$ term represents the
multi-user interference at user $k$. In order to maintain the power constraint, it is assumed
that $\mathbf{V}_k^H \mathbf{V}_k = \mathbf{I}_N$ and
$E[\|\mathbf{u}_k\|^2] \le \frac{P}{K}$ for $k = 1, \ldots, K$.

Following the BD procedure, each $\mathbf{V}_k$ is chosen such that
$\mathbf{H}_j^H \mathbf{V}_k = \mathbf{0}$, $\forall\, k \ne j$. This amounts to determining an
orthonormal basis for the left null space of the matrix formed by stacking all
$\{\mathbf{H}_j\}_{j \ne k}$ matrices together. This reduces the interference terms in
equation (5) to zero at each user. This is different from Zero Forcing, where each complex
symbol to be transmitted to the $m^{th}$ antenna (among the $N$ antennas, i.e.,
$m = 1, \ldots, N$) of the $k^{th}$ user is precoded by a vector that is orthogonal to all the
columns of $\mathbf{H}_j$, $j \ne k$, as well as orthogonal to all but the $m^{th}$ column of
$\mathbf{H}_k$.
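The null-space construction behind BD can be sketched in a few lines. This is an illustrative sketch only, assuming $K N = M$ and channels $\mathbf{H}_k$ of size $M \times N$ so that user $k$ sees $\mathbf{H}_k^H \mathbf{V}_j$; the helper name `bd_precoders` is hypothetical, and the SVD is just one of several ways to obtain a left null space basis.

```python
import numpy as np

def bd_precoders(H_list):
    """Return one M x N precoder per user, orthogonal to every other user's channel."""
    K = len(H_list)
    M, N = H_list[0].shape
    precoders = []
    for k in range(K):
        # Stack the channels of all other users side by side: M x (K-1)N.
        others = np.hstack([H_list[j] for j in range(K) if j != k])
        U, _, _ = np.linalg.svd(others, full_matrices=True)
        # The last M - (K-1)N left singular vectors are orthogonal to the
        # column space of `others`, i.e. they span its left null space.
        precoders.append(U[:, (K - 1) * N:])
    return precoders

rng = np.random.default_rng(0)
M, N = 6, 2
K = M // N
H_list = [rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
          for _ in range(K)]
V = bd_precoders(H_list)
# Largest residual interference gain over all pairs j != k.
leak = max(np.linalg.norm(H_list[j].conj().T @ V[k])
           for j in range(K) for k in range(K) if j != k)
```

With perfect channel knowledge `leak` is at the level of floating-point round-off; feeding quantized channels into the same routine instead leaves residual interference.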
However, zero interference can only be achieved with perfect knowledge of
$\{\mathbf{H}_k\}_{k=1}^{K}$ at the transmitter. In the case of limited feedback, when only a
quantized version of the subspace spanned by the columns of each $\mathbf{H}_k$ is available
at the transmitter, namely $\hat{\mathbf{H}}_k$, we use a naive strategy where the precoding
matrices are selected by treating $\hat{\mathbf{H}}_1, \ldots, \hat{\mathbf{H}}_K$ as the true
channels while performing BD. To distinguish these precoding matrices from those selected
with perfect CSIT, we denote these matrices as $\hat{\mathbf{V}}_1, \ldots, \hat{\mathbf{V}}_K$,
where each $\hat{\mathbf{V}}_k$ is chosen such that
$\hat{\mathbf{H}}_j^H \hat{\mathbf{V}}_k = \mathbf{0}$, $\forall\, k \ne j$. Thus,
$\mathbf{H}_j^H \hat{\mathbf{V}}_k \ne \mathbf{0}$ in general, which leads to residual
interference terms and a loss in throughput. The received signal in the case of limited
feedback is thus written as:

$$\mathbf{y}_k = \mathbf{H}_k^H \hat{\mathbf{V}}_k \mathbf{u}_k + \sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \mathbf{u}_j + \mathbf{n}_k \tag{6}$$

B. Random Quantization Codebooks 

Since the design of optimal quantization codebooks for the given distance metric is a
very difficult problem, we instead study performance averaged over random quantization
codebooks. The Grassmann manifold is the set of all $N$ dimensional subspaces (or planes)
passing through the origin, in an $M$ dimensional space. This is denoted by
$\mathcal{G}_{M,N}$. We consider complex Euclidean subspaces in this work. Each of the $2^B$
unitary matrices making up the random quantization codebook is chosen independently and is
uniformly distributed over $\mathcal{G}_{M,N}$ [7] [8]. We alternatively refer to this uniform
distribution as the isotropic distribution in the respective space. A random element drawn
from this distribution (over $\mathcal{G}_{M,N}$) can be generated by generating an
$M \times N$ matrix with i.i.d. complex Gaussian elements and then forming a specific
orthonormal basis for the $N$ dimensional subspace spanned by the matrix (e.g., through a QR
decomposition).
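As an illustration of the procedure just described, the sketch below draws a random codebook via QR decompositions of Gaussian matrices and quantizes a channel by exhaustive search under the chordal distance. It is a minimal sketch for small $B$ only; the names `random_subspace` and `quantize` are illustrative assumptions, and the distance is taken in its trace form $N - \|\tilde{\mathbf{H}}^H \mathbf{W}\|_F^2$.

```python
import numpy as np

def random_subspace(M, N, rng):
    # Orthonormal basis of the span of an M x N i.i.d. complex Gaussian matrix.
    G = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2)
    Q, _ = np.linalg.qr(G)
    return Q

def quantize(H, codebook):
    # Exhaustive search for the codeword with minimum squared chordal distance.
    Hb, _ = np.linalg.qr(H)
    N = Hb.shape[1]
    d2 = [N - np.linalg.norm(Hb.conj().T @ W, 'fro') ** 2 for W in codebook]
    idx = int(np.argmin(d2))
    return idx, float(d2[idx]), d2

rng = np.random.default_rng(0)
M, N, B = 4, 2, 4
codebook = [random_subspace(M, N, rng) for _ in range(2 ** B)]
H = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))
idx, d2_min, d2_all = quantize(H, codebook)
```

The search cost grows as $2^B$, which is why an exhaustive search like this one is only practical for small codebooks.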

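We analyze the performance averaged over all possible random codebooks. The distortion
or error associated with a given codebook $\mathcal{C}$ for the quantization of
$\mathbf{H}_k \in \mathbb{C}^{M \times N}$ is defined as:

$$D = E\left[\min_{\mathbf{W} \in \mathcal{C}} d^2(\mathbf{H}_k, \mathbf{W})\right] = E\left[d^2(\mathbf{H}_k, \hat{\mathbf{H}}_k)\right] \tag{7}$$

where $\hat{\mathbf{H}}_k$ is the quantization of $\mathbf{H}_k$. It is shown in [7] that
$D \le \bar{D}$ where,

$$\bar{D} = \frac{\Gamma\left(\frac{1}{T}\right)}{T} \left(C_{MN}\right)^{-\frac{1}{T}} 2^{-\frac{B}{T}} + N \exp\left[-\left(2^B C_{MN}\right)^{1-a}\right] \tag{8}$$

for a codebook of size $2^B$. Here, $T = N(M - N)$ and $a \in (0, 1)$ is a real number between
0 and 1 chosen such that $\left(C_{MN} 2^B\right)^{-\frac{a}{T}} \le 1$. $C_{MN}$ is given by
$\frac{1}{T!} \prod_{i=1}^{N} \frac{(M-i)!}{(N-i)!}$. The second (exponential) term in (8) can
be neglected for large $B$. For systems where $N = 2$ or 3, the exponential term may be
neglected for most practical cases.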
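To get a feel for how the distortion bound scales with $B$, the first term of the bound can be evaluated directly. The sketch below assumes the constant takes the form $C_{MN} = \frac{1}{T!}\prod_{i=1}^{N}\frac{(M-i)!}{(N-i)!}$ and that the exponential term is neglected; both the form of the constant and the function names are assumptions made for illustration.

```python
from math import factorial, gamma

def c_mn(M, N):
    # Assumed form of the constant: (1/T!) * prod_{i=1..N} (M-i)! / (N-i)!
    T = N * (M - N)
    prod = 1.0
    for i in range(1, N + 1):
        prod *= factorial(M - i) / factorial(N - i)
    return prod / factorial(T)

def distortion_bound(M, N, B):
    # First (dominant) term of the bound; the exponential term is neglected.
    T = N * (M - N)
    return gamma(1.0 / T) / T * c_mn(M, N) ** (-1.0 / T) * 2.0 ** (-B / T)

# Each additional T = N(M-N) bits halves the dominant term.
ratio = distortion_bound(6, 2, 16) / distortion_bound(6, 2, 8)
```

For $N = 1$ the constant reduces to 1 and the expression becomes $\Gamma\left(\frac{M}{M-1}\right) 2^{-B/(M-1)}$, the familiar form for single-antenna users.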

IV. Analysis and Results 

In this section, we analyze the achievable throughput of the limited feedback-based system 
described so far. We first describe some preliminary mathematical results. 

A. Preliminary Calculations 

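Lemma 1: The quantization of the channel admits the following decomposition:

$$\tilde{\mathbf{H}}_k = \hat{\mathbf{H}}_k \mathbf{X}_k \mathbf{Y}_k + \mathbf{S}_k \mathbf{Z}_k \tag{9}$$

where

$\tilde{\mathbf{H}}_k \in \mathbb{C}^{M \times N}$ is an orthonormal basis for the subspace
spanned by the columns of $\mathbf{H}_k$,

$\mathbf{X}_k \in \mathbb{C}^{N \times N}$ is unitary and distributed uniformly over
$\mathcal{G}_{N,N}$,

$\mathbf{Z}_k \in \mathbb{C}^{N \times N}$ is upper triangular with positive diagonal elements,
satisfying $\mathrm{tr}(\mathbf{Z}_k^H \mathbf{Z}_k) = d^2(\mathbf{H}_k, \hat{\mathbf{H}}_k)$,

$\mathbf{Y}_k \in \mathbb{C}^{N \times N}$ is upper triangular with positive diagonal elements
and satisfies $\mathbf{Y}_k^H \mathbf{Y}_k = \mathbf{I}_N - \mathbf{Z}_k^H \mathbf{Z}_k$, and

$\mathbf{S}_k \in \mathbb{C}^{M \times N}$ is an orthonormal basis for an isotropically
distributed (complex) $N$ dimensional plane in the $M - N$ dimensional left nullspace of
$\hat{\mathbf{H}}_k$.

Moreover, the quantities $\hat{\mathbf{H}}_k$, $\mathbf{X}_k$ and $\mathbf{Y}_k$ are
distributed independent of each other, as are the pair $\mathbf{S}_k$ and $\mathbf{Z}_k$. This
decomposition is a generalization of the decomposition in [5], which was for the specific case
of $N = 1$. Similar to [5], the matrix $\mathbf{Z}_k$ represents the quantization error.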

Proof: See Appendix I. ■

A direct application of Lemma 1 allows us to bound the rate loss due to limited feedback.
This decomposition also allows us to perform low complexity Monte-Carlo simulations for
evaluating the performance of random quantization codebooks, even for very large $B$, as
described in detail in Section V-C.

B. Throughput analysis for quantized feedback

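In the case of perfect CSIT and BD, the transmitter has the ability to suppress all
interference terms, giving a per-user ergodic rate of:

$$R_{CSIT-BD}(P) = E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \mathbf{H}_k^H \mathbf{V}_k \mathbf{V}_k^H \mathbf{H}_k\right|\right] \tag{10}$$

where $k$ is any user from $1, \ldots, K$. The expectation is carried out over the
distribution of $\mathbf{H}_k$.

For limited feedback of $B$ bits per user, multiuser interference cannot be completely
canceled and this leads to residual interference power. The per-user rate (throughput) is
given by:

$$R_{QUANT}(P) = E\left[I(\mathbf{u}_k; \mathbf{y}_k \mid \mathbf{H}_k)\right] \tag{11}$$

$$= E\left[\log_2 \frac{\left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|}{\left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|}\right] \tag{12}$$

$$= E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|\right] - E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|\right] \tag{13}$$

where $k$ is any user between 1 and $K$ and the expectation is carried out over the channel
distribution as well as random codebooks $\mathcal{C}$.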

Theorem 1: The rate loss incurred per user due to limited feedback with respect to perfect
CSIT using Block Diagonalization can be bounded from above by:

$$\Delta R_{QUANT}(P) = \left[R_{CSIT-BD}(P) - R_{QUANT}(P)\right] \le N \log_2\left(1 + \frac{P}{N} D\right)$$

Proof: See Appendix II. ■

This provides a bound on the rate loss per user.¹ Furthermore, $D$ can be tightly upper
bounded by $\bar{D}$ from (8).

C. Controlling feedback quality 

If $B$ is kept fixed and the SNR is taken to $\infty$, it is easy to see that residual
interference will eventually overwhelm signal power, and this leads to a bounded throughput
(i.e., zero multiplexing gain). Therefore, it is of interest to determine how fast $B$ must
grow with SNR in order to prevent this behavior and to maintain a bounded rate loss relative
to a perfect CSIT system.

¹ Note that a factor of $N$ was erroneously omitted from this bound when this result was stated in [9].

Fig. 1. Sufficient number of bits for a gap of 3 dB relative to BD with perfect CSIT, for N = 2 and M = 4, 6 and 8

Theorem 2: In order to bound the per-user rate loss $\Delta R_{QUANT}(P)$ from above by
$\log_2(b) > 0$, it is sufficient for the number of feedback bits per user to be scaled with
SNR as:

$$B \approx \frac{N(M-N)}{3} P_{dB} - N(M-N) \log_2\left(N\left(b^{\frac{1}{N}} - 1\right)\right) + N(M-N) \log_2\left(\frac{\Gamma\left(\frac{1}{N(M-N)}\right)}{N(M-N)}\right) - \log_2(C_{MN}) \tag{14}$$

Proof: This expression can be found by equating the upper bound from Theorem 1 with
$\log_2 b$ and solving for $B$ as a function of $P$. Solving this numerically will yield the
number of bits sufficient for a maximum rate loss of $\log_2 b$. We assume that $B$ is large
enough to neglect the exponential term in the expression for $\bar{D}$ from (8), which yields
the above approximation. ■
The total contribution of the term containing the logarithm of the gamma function is very
small and can usually be neglected. To maintain a system throughput loss of $M$ bps/Hz (i.e.,
$N$ bps/Hz per user, or $b = 2^N$), which corresponds to an SNR gap of no more than 3 dB with
respect to BD with perfect CSIT, it is sufficient to scale the bits as:

$$B \approx \frac{N(M-N)}{3} P_{dB} - \log_2(C'_{MN}) \tag{15}$$

where $C'_{MN} = N^{N(M-N)} C_{MN}$. Figure 1 shows the sufficient number of bits required to
maintain this level of performance, when $N = 2$ and $M = 4$, 6 and 8.
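The scaling law can be sanity-checked by substituting the resulting $B$ back into the rate loss bound. The sketch below assumes the bound has the form $N \log_2\left(1 + \frac{P}{N}\bar{D}\right)$ with $\bar{D} \approx \frac{\Gamma(1/T)}{T} C_{MN}^{-1/T} 2^{-B/T}$ and $C_{MN} = \frac{1}{T!}\prod_{i=1}^{N}\frac{(M-i)!}{(N-i)!}$; these forms and all function names are assumptions for illustration. It uses the exact $\log_2 P$ rather than the $P_{dB}/3$ approximation.

```python
from math import factorial, gamma, log2, log10

def c_mn(M, N):
    # Assumed form of the constant C_MN.
    T = N * (M - N)
    prod = 1.0
    for i in range(1, N + 1):
        prod *= factorial(M - i) / factorial(N - i)
    return prod / factorial(T)

def sufficient_bits(M, N, P_dB, b):
    # Solve N*log2(1 + (P/N)*Dbar(B)) = log2(b) for B, exponential term neglected.
    T = N * (M - N)
    log2_P = P_dB / (10.0 * log10(2.0))
    return (T * log2_P
            - T * log2(N * (b ** (1.0 / N) - 1.0))
            + T * log2(gamma(1.0 / T) / T)
            - log2(c_mn(M, N)))

def rate_loss_bound(M, N, P, B):
    T = N * (M - N)
    d_bar = gamma(1.0 / T) / T * c_mn(M, N) ** (-1.0 / T) * 2.0 ** (-B / T)
    return N * log2(1.0 + P / N * d_bar)

M, N, P_dB = 6, 2, 10.0
b = 2.0 ** N                      # N bps/Hz per user, i.e. a 3 dB SNR gap
B = sufficient_bits(M, N, P_dB, b)
loss = rate_loss_bound(M, N, 10.0 ** (P_dB / 10.0), B)
```

In practice $B$ would be rounded up to an integer, which only tightens the bound.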

The pre-log factor (i.e. the factor that multiplies the SNR in dB) is $N(M - N)$ rather
than $MN$, which is intuitively because the space of $N$ dimensional subspaces in an $M$
dimensional space has a dimensionality of $N(M - N)$.

V. Performance Comparison and Numerical Results 

A. Zero forcing vs. Block diagonalization 

Zero forcing is a simple low-complexity linear precoding strategy, and it is important to
compare the performance of these two schemes in the presence of limited feedback. Zero
forcing for a MIMO broadcast system with $K$ users and $N$ antennas per user is equivalent
to a $KN = M$ user system with a single antenna per user. The feedback scaling law for
such a system is derived in [5] to be:

$$B_{ZF} \approx \frac{M-1}{3} P_{dB} \tag{16}$$

to maintain an SNR gap of no more than 3 dB with respect to ZF under perfect CSIT
conditions. In this system, each user with $N$ antennas quantizes the direction of the channel
vector (i.e. the channel vector normalized to have norm unity) of each of the $N$ antennas
separately, and feeds this back to the transmitter.

In general, if BD with perfect CSIT achieves a sum rate of $R_{CSIT-BD}(P)$ with $M$, $N$
antennas at the transmitter and each of the $\frac{M}{N}$ users respectively, and ZF achieves
$R_{CSIT-ZF}(P)$ for the same system, $R_{CSIT-BD}(P)$ will eventually dominate
$R_{CSIT-ZF}(P)$ by a constant amount. Thus, we see an immediate advantage of BD with respect
to ZF from (15), where the pre-log factor for BD is $N(M - N)$ for $N$ antennas, or $M - N$ per
user antenna. This is compared to the factor $M - 1$ in (16), which is for a lower target rate.
This difference between $M - 1$ and $M - N$ is perhaps due to the fact that the space of $N$
dimensional subspaces in an $M$ dimensional space has a dimensionality of $N(M - N)$ while the
space of $N$ one-dimensional subspaces in an $M$ dimensional space has dimensionality
$N(M - 1)$.

The rate gap between BD and ZF with perfect CSIT is given by [4]: 

^ N-i 
i=i 

at high SNR. For fair comparison of the number of bits required for BD and ZF under
imperfect CSIT and limited feedback, it is necessary to fix a common target rate. By setting
$b = 2^{R_g(P) + R}$ in (14), where $R_g(P)$ is the (per-user) rate gap between BD and ZF (with
perfect CSIT) and $R$ the target (per-user) rate loss for the ZF system, we can compare the
sufficient number of bits required to achieve the same sum rate for both strategies. For
example, $R = 1$ for a 3 dB target offset in SNR, relative to the rate achievable with ZF and
perfect CSIT. This suggests a bit savings of 48% for an $M = 6$, $N = 2$ system at 15 dB, and
63% for an $M = 9$, $N = 3$ system with BD. The scaling law in Theorem 2 is slightly
conservative for large $b$, and the advantage of BD is somewhat underestimated. Numerical
results show that the bit savings possible with BD are even higher.

An alternative antenna combining method (when the users have multiple antennas) is
proposed in [10], where each user receives only a single stream of data (as opposed to $N$
streams of data with BD), but uses the extra antennas to obtain a very accurate quantization of
the effective channel. This effectively allows for a reduction in feedback load, and produces
the same pre-log factor as BD, i.e., $N(M - N)$, but needs $N$ times the number of users in
the system (i.e. $K = M$ where each user has $N$ antennas, rather than the $K = \frac{M}{N}$
for BD). Table I compares the sufficient number of bits required to achieve the same target
rate, i.e., 3 dB (in SNR) away from ZF with perfect CSIT, when using BD, ZF and antenna
combining for an $M = 6$, $N = 2$ system. ZF and BD have $K = 3$, while antenna combining has
$K = 6$.



TABLE I
Feedback requirement (bits) for different multiple user-antenna strategies (M = 6, N = 2)

SNR      Block Diagonalization    Zero Forcing    Antenna Combining
5 dB               1                    9                 8
10 dB              7                   17                15
15 dB             13                   25                21
20 dB             20                   34                28
25 dB             26                   42                35
30 dB             33                   50                41

B. Analog Feedback 

We consider here the case when each user $k$ feeds back its channel $\mathbf{H}_k$ by
explicitly transmitting the $MN$ complex coefficients $(\mathbf{H}_k)_{m,n}$,
$m = 1, \ldots, M$, $n = 1, \ldots, N$ over the feedback channel. We assume that the uplink
feedback channel is unfaded AWGN with the same SNR as the downlink (i.e., $P$). Each user may
transmit each coefficient effectively '$\beta$' times on the uplink, resulting in the
following matrix being received at the transmitter:

$$\mathbf{G}_k = \sqrt{\beta P}\, \mathbf{H}_k + \mathbf{N}_k. \tag{18}$$

Here, $\mathbf{N}_k$ represents the feedback (additive white Gaussian) noise, whose entries
are independent and complex Gaussian with unit variance. As the coefficients of
$\mathbf{H}_k$ are also independent and complex Gaussian with unit variance, the optimal
estimator is the MMSE estimator:

$$\hat{\mathbf{H}}_k = \frac{\sqrt{\beta P}}{1 + \beta P}\, \mathbf{G}_k \tag{19}$$

where $\hat{\mathbf{H}}_k$ is the estimate of $\mathbf{H}_k$ formed at the transmitter. It is
convenient to express $\mathbf{H}_k$ in terms of the estimate and estimation noise as follows:

$$\mathbf{H}_k = \hat{\mathbf{H}}_k + \frac{1}{\sqrt{1 + \beta P}}\, \mathbf{F}_k \tag{20}$$

where the entries of $\mathbf{F}_k$ are also independent and complex Gaussian with unit
variance, and independent of the estimator.
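The estimation error variance implied by this model can be checked with a short Monte-Carlo experiment. The sketch below assumes the received feedback is $\sqrt{\beta P}\,\mathbf{H}_k$ plus unit-variance noise and that the MMSE estimate scales it by $\frac{\sqrt{\beta P}}{1+\beta P}$; the function name and the parameter values are illustrative.

```python
import numpy as np

def analog_feedback_estimate(H, P, beta, rng):
    # Received feedback: sqrt(beta*P) * H + unit-variance complex Gaussian noise.
    noise = (rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)) / np.sqrt(2)
    G = np.sqrt(beta * P) * H + noise
    # Linear MMSE estimate of H from G (H has unit-variance entries).
    return np.sqrt(beta * P) / (1.0 + beta * P) * G

rng = np.random.default_rng(1)
P, beta = 10.0, 2.0
n = 200_000
# Treat many i.i.d. channel coefficients at once.
H = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
H_hat = analog_feedback_estimate(H, P, beta, rng)
err = H - H_hat
err_var = float(np.mean(np.abs(err) ** 2))          # should be near 1 / (1 + beta*P)
cross = complex(np.mean(np.conj(H_hat) * err))      # should be near 0 (orthogonality)
```

The empirical error variance approaches $\frac{1}{1+\beta P}$, so increasing either the uplink SNR or the repetition factor $\beta$ improves the estimate.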

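The beamformers $\{\hat{\mathbf{V}}_k\}_{k=1}^{K}$ are selected by treating
$\{\hat{\mathbf{H}}_k\}_{k=1}^{K}$ as the 'true' set of channels, and following the BD
procedure. Note that the marginal distribution of the beamformers is the same as in the
quantized feedback case, as the addition of independent white Gaussian noise does not affect
the isotropic property. As in the case for quantized (digital) feedback, we compute the
quantity:

$$\mathbf{H}_k^H \hat{\mathbf{V}}_j = \frac{1}{\sqrt{1 + \beta P}}\, \mathbf{F}_k^H \hat{\mathbf{V}}_j \tag{21}$$

for $k \ne j$, which follows from the fact that
$\hat{\mathbf{H}}_k^H \hat{\mathbf{V}}_j = \mathbf{0}$ for $k \ne j$. Similar to (13), we write
the rate with 'analog' feedback as follows:

$$R_{ANALOG}(P) = E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|\right] - E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1, j \ne k}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|\right] \tag{22}$$

Similar to the proof of Theorem 1 and using techniques similar to those in [11], we
compute a bound on the rate gap relative to BD with perfect CSIT to be:

$$\Delta R_{ANALOG}(P) = \left[R_{CSIT-BD}(P) - R_{ANALOG}(P)\right] \tag{23}$$

$$\le N \log_2\left(1 + \frac{M-N}{M} \cdot \frac{P}{1 + \beta P}\right) \tag{24}$$

$$\le N \log_2\left(1 + \frac{M-N}{M} \cdot \frac{1}{\beta}\right) \tag{25}$$

The proof of the bound (24) is given in Appendix III. (25) is obtained by letting
$P \to \infty$ in (24).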
In order to compare analog and quantized feedback, we measure the feedback quantity in
terms of 'feedback symbols' rather than bits. Although analog feedback involves effectively
$\beta MN$ channel uses per user (assuming that the users have orthogonal feedback channels),
it also conveys more information than the quantized case, specifically information regarding
the eigenvalues and eigenvector structure, which the 'subspace' information does not capture.

Hence, for fair comparison, we equate the $\beta MN$ analog channel uses to
$\beta N(M - N)$ channel symbols in the quantized case (the 'subspace' information may be
specified by $N(M - N)$ complex numbers). Under the simplifying assumption that error-free
communication at capacity is possible, we set $B = \beta N(M - N) \log_2(1 + P)$ for
$\beta N(M - N)$ channel uses of the AWGN feedback channel with SNR $P$. From Theorem 1 we
have:

$$\Delta R_{QUANT}(P) \le N \log_2\left(1 + \frac{P}{N} \bar{D}\right) \tag{26}$$

$$= N \log_2\left(1 + \frac{P (1 + P)^{-\beta}}{N} C''_{MN}\right) \tag{27}$$

where $D$ has been bounded from (8) (neglecting the exponential term), and

$$C''_{MN} = \frac{\Gamma\left(\frac{1}{N(M-N)}\right)}{N(M-N)} \left(C_{MN}\right)^{-\frac{1}{N(M-N)}}. \tag{28}$$

Our conclusions are similar to the $N = 1$ case, which was considered in [12]. For
$\beta = 1$, both bounds on the rate gap (i.e. for analog and quantized feedback) behave
similarly, and the gap does not vanish as $P \to \infty$. For $\beta > 1$, the rate gap bound
decreases rapidly (exponentially fast) for quantized feedback, and vanishes entirely as
$P \to \infty$. However, for analog feedback, the decrease is relatively slow (i.e. only
polynomially fast) and does not vanish as $P \to \infty$. The analysis may also be extended to
the case when errors occur with quantized feedback, using techniques similar to those in [12].
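The contrast between the two bounds can be seen by evaluating them side by side. This is a sketch under assumed forms: the analog bound is taken as $N\log_2\left(1+\frac{M-N}{M}\frac{P}{1+\beta P}\right)$ and the quantized bound as $N\log_2\left(1+\frac{P(1+P)^{-\beta}}{N}\,\frac{\Gamma(1/T)}{T} C_{MN}^{-1/T}\right)$ with $T = N(M-N)$ and $C_{MN} = \frac{1}{T!}\prod_{i=1}^{N}\frac{(M-i)!}{(N-i)!}$; function names are illustrative.

```python
from math import factorial, gamma, log2

def c_mn(M, N):
    # Assumed form of the constant C_MN.
    T = N * (M - N)
    prod = 1.0
    for i in range(1, N + 1):
        prod *= factorial(M - i) / factorial(N - i)
    return prod / factorial(T)

def analog_gap_bound(M, N, P, beta):
    return N * log2(1.0 + (M - N) / M * P / (1.0 + beta * P))

def quantized_gap_bound(M, N, P, beta):
    # B = beta * N(M-N) * log2(1+P) feedback bits, exponential term neglected.
    T = N * (M - N)
    c2 = gamma(1.0 / T) / T * c_mn(M, N) ** (-1.0 / T)
    return N * log2(1.0 + P * (1.0 + P) ** (-beta) / N * c2)

M, N, beta = 4, 2, 2.0
snrs = [10.0 ** (p_db / 10.0) for p_db in (0, 10, 20, 30, 60)]
analog = [analog_gap_bound(M, N, P, beta) for P in snrs]
quantized = [quantized_gap_bound(M, N, P, beta) for P in snrs]
analog_limit = N * log2(1.0 + (M - N) / (M * beta))
```

For $\beta > 1$ the quantized bound keeps shrinking with SNR, while the analog bound saturates at `analog_limit`.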

C. Generation of Numerical Results 

The number of bits given by (14) can be very large and numerical simulation becomes a
computationally complex task, as the chordal distance will have to be calculated for each of
the $2^B$ matrices in the codebook. However, utilizing the statistics of random codebooks, the
quantization procedure can be precisely emulated without having to do actual quantization.
From Lemma 1, we can repeat the argument by interchanging $\tilde{\mathbf{H}}_k$ and
$\hat{\mathbf{H}}_k$, to yield the following equivalent decomposition:

$$\hat{\mathbf{H}}_k = \tilde{\mathbf{H}}_k \mathbf{X}_k \mathbf{Y}_k + \mathbf{S}_k \mathbf{Z}_k \tag{29}$$

which can be used to generate $\hat{\mathbf{H}}_k$, given $\tilde{\mathbf{H}}_k$ and a
codebook size. $\mathbf{X}_k$ is isotropic and independent of the codebook size, as is
$\mathbf{S}_k$, which (in this decomposition) is isotropically distributed in the left
nullspace of $\tilde{\mathbf{H}}_k$. Samples drawn from the distribution of these matrices
can thus be generated as samples from the isotropic distribution in their respective spaces.

Moreover, $d^2(\tilde{\mathbf{H}}_k, \hat{\mathbf{H}}_k) = \mathrm{tr}(\mathbf{Z}_k^H \mathbf{Z}_k)$
is the $1^{st}$ order statistic from $2^B$ samples. Here, each sample is drawn from the
distribution of the trace of a matrix-variate beta distribution (as described in Appendix I).
Thus, a sample drawn from the distribution of $\mathrm{tr}(\mathbf{Z}_k^H \mathbf{Z}_k)$ can be
generated by the 'CDF inversion' method, by computing the CDF for a specific $M$ and $N$.
A general expression for the CDF has been computed in closed form in [7], for the case
when $d^2(\tilde{\mathbf{H}}_k, \hat{\mathbf{H}}_k) \le 1$. For moderate to large $B$ and
practical values of $M$, $N$, this event occurs with extremely high probability, allowing for
low complexity CDF inversion. For very small values of $B$,
$d^2(\tilde{\mathbf{H}}_k, \hat{\mathbf{H}}_k)$ may be greater than 1 with appreciable
probability, but an exhaustive search among $2^B$ possibilities is not a problem in these
cases.
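The order-statistic step can be illustrated with a generic CDF-inversion sampler. The sketch below rests on a loudly stated assumption: that for squared distances $x \le 1$ the single-codeword CDF has the power-law form $F(x) = C_{MN}\, x^{T}$ with $T = N(M-N)$ and $C_{MN} = \frac{1}{T!}\prod_{i=1}^{N}\frac{(M-i)!}{(N-i)!}$, which is consistent with the dominant term of the distortion bound but is not quoted from [7]. Function names are illustrative.

```python
import numpy as np
from math import factorial, gamma

def c_mn(M, N):
    # Assumed form of the constant C_MN.
    T = N * (M - N)
    prod = 1.0
    for i in range(1, N + 1):
        prod *= factorial(M - i) / factorial(N - i)
    return prod / factorial(T)

def sample_min_distance_sq(M, N, B, size, rng):
    """Draw the minimum of 2^B i.i.d. squared distances without building a codebook."""
    T = N * (M - N)
    U = rng.random(size)
    # The minimum of n draws has CDF 1 - (1 - F(x))^n; setting this equal to U
    # gives F(x) = 1 - (1 - U)^(1/n), computed stably for huge n = 2^B.
    p = -np.expm1(np.log1p(-U) / 2.0 ** B)
    # Invert the assumed power law F(x) = C_MN * x^T (valid while x <= 1).
    return (p / c_mn(M, N)) ** (1.0 / T)

rng = np.random.default_rng(0)
M, N, B = 4, 2, 10
samples = sample_min_distance_sq(M, N, B, 200_000, rng)
T = N * (M - N)
predicted_mean = gamma(1.0 / T) / T * (c_mn(M, N) * 2.0 ** B) ** (-1.0 / T)
```

The cost of drawing a sample does not depend on $B$, which is the point of emulating the quantizer rather than searching a codebook.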

From the eigen decomposition $\mathbf{Z}_k^H \mathbf{Z}_k = \mathbf{E}_k \mathbf{D}_k \mathbf{E}_k^H$,
as described in Appendix II, $\mathbf{E}_k$ can be generated as the eigenvectors of any
(complex) Beta($N$, $M - N$) distributed matrix. Further, the distribution of the eigenvalues
(i.e., the entries of $\mathbf{D}_k$) conditioned on their sum (which is equal to
$d^2(\tilde{\mathbf{H}}_k, \hat{\mathbf{H}}_k)$), can be computed from their joint distribution
[13] ([7] for the complex case). The conditional distribution can be easily computed for small
values of $N$.

In particular, for $N = 2$, if $D_1$, $D_2$ are the diagonal elements of $\mathbf{D}_k$ with
joint density $f_{D_1, D_2}(d_1, d_2)$, the distribution of $D_1$ conditioned on
$Z = D_1 + D_2 \le 1$ is given as:

$$F_{D_1 | Z}(d_1 | z) = \frac{\int_0^{d_1} f_{D_1, D_2}(t, z - t)\, dt}{f_Z(z)} \tag{30}$$

$$= \frac{\int_0^{d_1} V_M (z - 2t)^2 (1 - t)^{M-4} (1 - z + t)^{M-4}\, dt}{f_Z(z)} \tag{31}$$

where $f_Z(z)$ is the pdf of $Z$, which for $z \le 1$ is the marginal

$$f_Z(z) = \int_0^{z} f_{D_1, D_2}(t, z - t)\, dt. \tag{32}$$

$V_M$ is a normalizing constant and is given by
$V_M = \frac{1}{2}(M - 1)(M - 2)^2(M - 3)$. For efficient CDF inversion,
$F_{D_1 | Z}(d_1 | z)$ can be computed in closed form for specific values of $M$.

As $\mathbf{Y}_k^H \mathbf{Y}_k = \mathbf{I}_N - \mathbf{Z}_k^H \mathbf{Z}_k$, $\mathbf{Y}_k$
can be obtained as well. Putting all this together, one is able to randomly generate a
realization of the quantized version of $\mathbf{H}_k$, when random codebooks are used. This
prevents the computational complexity from growing with $B$. However, for extremely large $B$,
numerical errors may dominate and care must be taken to maintain numerical precision.

D. Numerical Results 



Fig. 2. MIMO Broadcast Channel with M = 4, N = 2, K = 2 (horizontal axis: SNR in dB)

We present numerical results for $N = 2$ and $M = 4$, 6, 8 in Figures 2, 3 and 4
respectively, while scaling the bits as per (15), i.e. with a target of staying at most 3 dB
away (in SNR) from BD with perfect CSIT. As Theorem 2 only provides the sufficient number of
bits, this is a conservative strategy and the actual SNR gaps are found to be 2.65 dB, 2.72 dB
and 2.84 dB for $M = 4$, 6 and 8 respectively, instead of 3 dB. The results also show that
keeping the number of bits fixed results in a rate gap that grows without bound with SNR.

VI. Conclusion

Accurate CSIT is clearly important for MIMO broadcast systems in order to achieve
maximum throughput. When the receiver knows the channel perfectly and instantaneously
feeds this information back to the transmitter using a finite number of bits, we have quantified
the rate loss and have shown that increasing the number of bits linearly with the system SNR
is sufficient to maintain a constant SNR loss with respect to perfect CSIT. Further, we have
established the advantage of BD relative to ZF in terms of feedback load, and the advantage
of using quantized feedback as opposed to using analog feedback. Note that BD is just one
of many linear precoding techniques that can be used on the MIMO broadcast channel with
multiple user antennas (e.g., see coordinated beamforming [14] and Multiuser Eigenmode
Transmission [15]). It remains to be seen which of these performs best in a limited feedback
setting and also when multiuser diversity/user selection is considered.

Fig. 3. MIMO Broadcast Channel with M = 6, N = 2, K = 3 (horizontal axis: SNR in dB)

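Fig. 4. MIMO Broadcast Channel with M = 8, N = 2, K = 4

Appendix I
Proof of Lemma 1

Let $\mathbf{W}$ be any arbitrary matrix in the codebook $\mathcal{C}$. Note that $\mathbf{W}$
is independent of $\tilde{\mathbf{H}}_k$. We then decompose $\tilde{\mathbf{H}}_k$ into
components that lie in the column space of $\mathbf{W}$ and the left nullspace of $\mathbf{W}$
as follows:

$$\tilde{\mathbf{H}}_k = \mathbf{W}\mathbf{W}^H \tilde{\mathbf{H}}_k + \left(\mathbf{I}_M - \mathbf{W}\mathbf{W}^H\right) \tilde{\mathbf{H}}_k \tag{33}$$

$$= \mathbf{W}\mathbf{W}^H \tilde{\mathbf{H}}_k + \mathbf{W}^{\perp} (\mathbf{W}^{\perp})^H \tilde{\mathbf{H}}_k \tag{34}$$

where $\mathbf{W}\mathbf{W}^H$ and
$\mathbf{W}^{\perp}(\mathbf{W}^{\perp})^H = \mathbf{I}_M - \mathbf{W}\mathbf{W}^H$ are the
projection matrices for the column space and left nullspace of $\mathbf{W}$ respectively.
$\mathbf{W}^{\perp} \in \mathbb{C}^{M \times (M-N)}$ is chosen such that it forms an
orthonormal basis for the left nullspace of $\mathbf{W}$.

Let the (thin) QR decomposition of $\mathbf{W}\mathbf{W}^H \tilde{\mathbf{H}}_k$ be
$\mathbf{Q}_k \mathbf{A}_k$, where $\mathbf{Q}_k \in \mathbb{C}^{M \times N}$ forms an
orthonormal basis for the same space as $\mathbf{W}$, and
$\mathbf{A}_k \in \mathbb{C}^{N \times N}$ is upper triangular with positive diagonal elements.
Further, $\mathbf{Q}_k$ and $\mathbf{A}_k$ are independent, from [16, Theorem 2.3.18] (after
verification for the complex case). As $\mathbf{Q}_k$ and $\mathbf{W}$ describe the same
subspace, $\mathbf{Q}_k$ may be represented as a rotation of $\mathbf{W}$, i.e.,
$\mathbf{Q}_k = \mathbf{W}\mathbf{X}_k$ for some unitary matrix
$\mathbf{X}_k \in \mathbb{C}^{N \times N}$.

By isotropy and independence of $\mathbf{W}$ and $\tilde{\mathbf{H}}_k$, $\mathbf{X}_k$ is also
isotropically distributed and is independent of $\mathbf{W}$, which is an arbitrary
orthonormal basis. Also note that $\mathbf{W}\mathbf{W}^H = \mathbf{Q}_k \mathbf{Q}_k^H$ and
hence $\mathbf{A}_k^H \mathbf{A}_k = \tilde{\mathbf{H}}_k^H \mathbf{W}\mathbf{W}^H \tilde{\mathbf{H}}_k$.
Thus $\mathrm{tr}(\mathbf{A}_k^H \mathbf{A}_k) = N - d^2(\mathbf{W}, \tilde{\mathbf{H}}_k)$.

Note that $\mathbf{W}^{\perp}(\mathbf{W}^{\perp})^H \tilde{\mathbf{H}}_k$ is the projection of
$\tilde{\mathbf{H}}_k$ onto the left nullspace of $\mathbf{W}$. As $\tilde{\mathbf{H}}_k$ is
isotropically distributed, the projection is also isotropically distributed in the
corresponding $M - N$ dimensional nullspace. Let the (thin) QR decomposition of
$\mathbf{W}^{\perp}(\mathbf{W}^{\perp})^H \tilde{\mathbf{H}}_k$ be $\mathbf{S}_k \mathbf{B}_k$,
where $\mathbf{S}_k \in \mathbb{C}^{M \times N}$ is an orthonormal basis for an isotropically
distributed (complex) $N$ dimensional plane in the $M - N$ dimensional left nullspace of
$\mathbf{W}$ and $\mathbf{B}_k \in \mathbb{C}^{N \times N}$ is upper triangular with positive
diagonal elements. Similar to the previous case, $\mathbf{S}_k$ and $\mathbf{B}_k$ are
independently distributed. It is also straightforward to see that
$\mathbf{B}_k^H \mathbf{B}_k = \mathbf{I}_N - \mathbf{A}_k^H \mathbf{A}_k$ and
$\mathrm{tr}(\mathbf{B}_k^H \mathbf{B}_k) = d^2(\mathbf{W}, \tilde{\mathbf{H}}_k)$.

As $\tilde{\mathbf{H}}_k$ and $\mathbf{W}$ are independent, which has been our assumption thus
far in the proof, $\mathbf{B}_k^H \mathbf{B}_k$ is matrix-variate (complex) Beta($N$, $M - N$)
distributed [13]. We will now argue that most of the above conclusions remain unchanged, even
when the quantization procedure (2) is followed.

The quantization procedure amounts to choosing a $\mathbf{B}_k^H \mathbf{B}_k$ such that its
trace is the minimum among $2^B$ choices. Thus, it follows that the quantization procedure
only affects $\mathbf{B}_k$ (and $\mathbf{A}_k$, which is the 'inverse' quantization error and
is related to $\mathbf{B}_k$ by
$\mathbf{A}_k^H \mathbf{A}_k = \mathbf{I}_N - \mathbf{B}_k^H \mathbf{B}_k$). We use
$\mathbf{Y}_k$ and $\mathbf{Z}_k$ to denote the matrices $\mathbf{A}_k$ and $\mathbf{B}_k$
after following the quantization procedure. Hence, even though
$\mathbf{Z}_k^H \mathbf{Z}_k$ is not beta distributed, the distribution of the quantities
$\mathbf{X}_k$, $\mathbf{S}_k$ and $\mathbf{W}$ remain the same, and are independent of
$\mathbf{Z}_k$ (and $\mathbf{Y}_k$). We now use $\hat{\mathbf{H}}_k$ to denote $\mathbf{W}$
after following the quantization procedure, according to the convention in (2).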



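Appendix II
Proof of Theorem 1

Theorem 1 is proved as follows:

Here, (a) follows by neglecting the positive semi-definite interference terms in the quantity:

$$E\left[\log_2 \left|\mathbf{I}_N + \frac{P}{M} \sum_{j=1}^{K} \mathbf{H}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{H}_k\right|\right].$$

By the BD procedure, both $\mathbf{V}_k$ and $\hat{\mathbf{V}}_k$ are distributed
isotropically, and are chosen independent of $\mathbf{H}_k$, which results in (b). We write
$\mathbf{H}_k \mathbf{H}_k^H = \tilde{\mathbf{H}}_k \mathbf{\Lambda}_k \tilde{\mathbf{H}}_k^H$,
where $\tilde{\mathbf{H}}_k \in \mathbb{C}^{M \times N}$ forms an orthonormal basis for the
subspace spanned by the columns of $\mathbf{H}_k$ and
$\mathbf{\Lambda}_k = \mathrm{diag}[\lambda_1, \ldots, \lambda_N]$ are the non-zero, unordered
eigenvalues of $\mathbf{H}_k \mathbf{H}_k^H$ ($\mathbf{H}_k \mathbf{H}_k^H$ is of rank $N$ and
diagonalizable with probability 1). Both the density function of $\mathbf{H}_k$ (which is
matrix-variate complex Normal distributed) [16] and the Jacobian of the singular value
decomposition transformation of a matrix [17] can be separated into a product of functions of
$\tilde{\mathbf{H}}_k$ and $\mathbf{\Lambda}_k$ alone. Thus, $\tilde{\mathbf{H}}_k$ and
$\mathbf{\Lambda}_k$ are independent and $E[\mathbf{\Lambda}_k] = M \mathbf{I}_N$. Step (c)
follows using this and the fact that
$|\mathbf{I} + \mathbf{A}\mathbf{B}| = |\mathbf{I} + \mathbf{B}\mathbf{A}|$, for matrices
$\mathbf{A}$ and $\mathbf{B}$. Next, (d) follows from Jensen's inequality due to the concavity
of $\log |\cdot|$. Step (e) is proved as follows. First, we compute

$$\tilde{\mathbf{H}}_k^H \hat{\mathbf{V}}_j = \left(\hat{\mathbf{H}}_k \mathbf{X}_k \mathbf{Y}_k + \mathbf{S}_k \mathbf{Z}_k\right)^H \hat{\mathbf{V}}_j \tag{42}$$

$$= \mathbf{Z}_k^H \mathbf{S}_k^H \hat{\mathbf{V}}_j \tag{43}$$

for $k \ne j$, which follows from Lemma 1 and the fact that
$\hat{\mathbf{H}}_k^H \hat{\mathbf{V}}_j = \mathbf{0}$, $\forall\, k \ne j$, by the BD
procedure. Therefore,

$$\log_2 \left|\mathbf{I}_N + P(K-1)\, E\left[\tilde{\mathbf{H}}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \tilde{\mathbf{H}}_k\right]\right| = \log_2 \left|\mathbf{I}_N + P(K-1)\, E\left[\mathbf{Z}_k^H \left(\mathbf{S}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{S}_k\right) \mathbf{Z}_k\right]\right| \tag{44}$$

$$\overset{(f)}{=} \log_2 \left|\mathbf{I}_N + P(K-1) \frac{N}{M-N}\, E\left[\mathbf{Z}_k^H \mathbf{Z}_k\right]\right| \overset{(g)}{=} \log_2 \left|\mathbf{I}_N + P(K-1) \frac{N}{M-N} \cdot \frac{D}{N}\, \mathbf{I}_N\right| \tag{45}$$

Here, (f) follows from the fact that $\hat{\mathbf{V}}_j$ (which is just isotropically
distributed in the left nullspace of $\hat{\mathbf{H}}_k$) and $\mathbf{Z}_k$ are independent,
as are $\mathbf{S}_k$ and $\mathbf{Z}_k$ from Lemma 1. Further, $\mathbf{S}_k$ is also
isotropically distributed in the left nullspace of $\hat{\mathbf{H}}_k$, and is independent of
$\hat{\mathbf{V}}_j$. Thus $\hat{\mathbf{V}}_j^H \mathbf{S}_k \mathbf{S}_k^H \hat{\mathbf{V}}_j$
is matrix-variate Beta($N$, $M - 2N$) distributed [16], and
$E\left[\mathbf{Z}_k^H \left(\mathbf{S}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{S}_k\right) \mathbf{Z}_k\right] = \frac{N}{M-N}\, E\left[\mathbf{Z}_k^H \mathbf{Z}_k\right]$,
by [16, Theorem 5.3.12] and [16, Theorem 5.3.19] (after verification for the complex case).

Let $\mathbf{E}_k \mathbf{D}_k \mathbf{E}_k^H$ be the eigen decomposition of
$\mathbf{Z}_k^H \mathbf{Z}_k$, where $\mathbf{E}_k \in \mathbb{C}^{N \times N}$ is orthonormal
and $\mathbf{D}_k \in \mathbb{C}^{N \times N}$ is diagonal, with strictly positive elements
along the diagonal. If an arbitrary matrix in the codebook $\mathcal{C}$ is selected as the
quantization, $\mathbf{Z}_k^H \mathbf{Z}_k$ is matrix-variate (complex) Beta($N$, $M - N$)
distributed (as described in Appendix I), and $E[\mathbf{Z}_k^H \mathbf{Z}_k]$ is a multiple of
the identity matrix. Both the density function of this distribution [16] and the Jacobian of
the eigen decomposition transformation for a matrix [17] can be separated into a product of
functions of $\mathbf{E}_k$ and $\mathbf{D}_k$ alone, and these are hence independently
distributed.

For the actual quantization matrix, after following the procedure in (2), only the
distribution of the diagonal matrix $\mathbf{D}_k$ is affected, and the distribution of
$\mathbf{E}_k$ remains unchanged and independent of $\mathbf{D}_k$. Thus, we have that
$E[\mathbf{Z}_k^H \mathbf{Z}_k] = \rho\, \mathbf{I}_N$ for some constant $\rho$, even after
following the quantization procedure. This can also be concluded by observing that
$\mathbf{Z}_k^H \mathbf{Z}_k$ is invariant to unitary rotations. In terms of the trace of the
matrix, we have $\rho = \frac{E[\mathrm{tr}(\mathbf{Z}_k^H \mathbf{Z}_k)]}{N} = \frac{D}{N}$,
and (g) follows.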



Appendix III
Proof of equation (24)

Here, (a) and (b) have the same justification as in the proof of Theorem 1 (in Appendix II),
(c) follows from (21), and (d) is obtained by applying Jensen's inequality. By Gaussianity of
$\mathbf{F}_k$ and independence of $\mathbf{F}_k$ and $\hat{\mathbf{V}}_j$,
$\mathbf{F}_k^H \hat{\mathbf{V}}_j$ is matrix-variate complex Gaussian distributed with i.i.d.
elements, and
$E\left[\mathbf{F}_k^H \hat{\mathbf{V}}_j \hat{\mathbf{V}}_j^H \mathbf{F}_k\right] = N \mathbf{I}_N$,
which results in (e), i.e., the bound
$N \log_2\left(1 + \frac{M-N}{M} \cdot \frac{P}{1 + \beta P}\right)$ in (24).